정보과학회논문지 (Journal of KIISE)
Korean Title |
고품질 빅데이터 분석을 위한 최적의 전처리 순열 추천 방법 |
English Title |
A Recommendation Scheme for an Optimal Pre-processing Permutation Towards High-Quality Big Data Analytics |
Author |
Seounghyun Kim (김성현)
Young-Kyoon Suh (서영균)
Byungchul Tak (탁병철)
|
Citation |
Vol. 47, No. 3, pp. 319-327 (March 2020) |
Korean Abstract (English translation) |
Today, with the explosive growth of data, research on intelligent services based on big data analysis is being actively pursued in various fields. Big data analysis through data mining or machine learning requires pre-processing of the training data. Although incomplete or inappropriate pre-processing of the given data can produce unreliable analysis results, it is difficult for users to select the optimal set of pre-processing functions, and their order, that yields the best results. To address this problem, this paper designs and implements a platform that analyzes and recommends the permutation of pre-processing functions optimized for user-provided data. An evaluation of the proposed recommendation method on real-world data demonstrates that the optimal pre-processing permutation delivers the best accuracy when compared with the worst pre-processing permutation. By applying the proposed method, users can select the best pre-processing permutation and are therefore expected to obtain high-quality big data analysis results.
|
English Abstract |
Today, due to the explosive increase in data, research on intelligent services based on big data analysis is being actively conducted in various domains. Pre-processing of training data is essential to big data analytics via data mining or machine learning. Although incomplete or inadequate pre-processing of a given dataset can result in unreliable analysis, it is challenging for users to choose the optimal set and sequence of pre-processing functions that lead to the best results. To address this problem, we have designed and implemented a pre-processing evaluation platform that analyzes the performance of various permutations of pre-processing functions for a given user dataset and then recommends the best permutation. Evaluation results on a real-world dataset demonstrate that the recommended pre-processing permutation achieves the highest accuracy, outperforming the worst pre-processing permutation. By applying the method proposed in this paper, users can choose the best pre-processing permutation and are thus expected to obtain high-quality big data analysis results.
|
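The idea summarized in the abstract, scoring every ordering of a set of pre-processing functions and recommending the best one, can be illustrated with a minimal sketch. This is not the authors' platform: the three pre-processing steps (`impute_mean`, `clip_outliers`, `min_max_scale`), the toy dataset `DATA`, the leave-one-out 1-nearest-neighbour scorer, and the `recommend` helper are all hypothetical choices made here for illustration, and the sketch enumerates only full permutations rather than subsets of the steps.

```python
from itertools import permutations
from statistics import mean

# Hypothetical pre-processing steps. Each takes and returns a list of
# (features, label) rows and tolerates missing values encoded as None.

def impute_mean(rows):
    """Replace None with the mean of the observed values in that column."""
    cols = list(zip(*[x for x, _ in rows]))
    means = [mean(v for v in col if v is not None) for col in cols]
    return [([m if v is None else v for v, m in zip(x, means)], y)
            for x, y in rows]

def clip_outliers(rows, limit=10.0):
    """Clamp every observed value into [-limit, limit]."""
    return [([v if v is None else max(-limit, min(limit, v)) for v in x], y)
            for x, y in rows]

def min_max_scale(rows):
    """Rescale each column to [0, 1] using its observed minimum and maximum."""
    cols = list(zip(*[x for x, _ in rows]))
    lows = [min(v for v in col if v is not None) for col in cols]
    highs = [max(v for v in col if v is not None) for col in cols]

    def scale(v, lo, hi):
        if v is None:
            return None
        return 0.0 if hi == lo else (v - lo) / (hi - lo)

    return [([scale(v, lo, hi) for v, lo, hi in zip(x, lows, highs)], y)
            for x, y in rows]

def loo_1nn_accuracy(rows):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, (x, y) in enumerate(rows):
        others = rows[:i] + rows[i + 1:]
        nearest = min(others,
                      key=lambda r: sum((a - b) ** 2 for a, b in zip(x, r[0])))
        correct += nearest[1] == y
    return correct / len(rows)

def recommend(rows, steps, score):
    """Score every permutation of `steps`; return (score, order), best first."""
    results = []
    for order in permutations(steps):
        data = rows
        for step in order:
            data = step(data)
        results.append((score(data), [s.__name__ for s in order]))
    results.sort(key=lambda r: r[0], reverse=True)
    return results

# Toy data (made up): two features, two missing values, one extreme outlier.
DATA = [
    ([1.0, 2.0], 0), ([2.0, 1.5], 0), ([None, 2.5], 0),
    ([8.0, 7.0], 1), ([9.0, None], 1), ([500.0, 8.0], 1),
]
STEPS = [impute_mean, clip_outliers, min_max_scale]

ranking = recommend(DATA, STEPS, loo_1nn_accuracy)
for acc, order in ranking:
    print(f"{acc:.2f}", " -> ".join(order))
```

The first entry of `ranking` is the recommended ordering and the last is the worst one under this scorer. Exhaustive enumeration grows factorially with the number of steps, so a sketch like this is only practical for a handful of pre-processing functions.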
Keyword |
pre-processing permutation (전처리 순열)
pre-processing optimization (전처리 최적화)
big data pre-processing (빅데이터 전처리)
pre-processing recommendation (전처리 추천)
|